007_002_lab_DQN 2 (Nature 2015).html
https://www.youtube.com/watch?v=ByB49iDMiZE&list=PLlMkM4tgfjnKsCWav-Z2F-MMFRx-2gMGG&index=16
@
DQN (2013):
1. Go deep
2. Use a replay memory to store transitions,
   to resolve correlations between samples
DQN (2015):
1. Use separate networks (main and target),
   to resolve the non-stationary target issue

Core idea of DQN 2013
img 2018-04-29 16-50-01.png

Core idea of DQN 2015
img 2018-04-29 16-50-37.png

@
How to create separated networks
img 2018-04-29 16-51-24.png

@
DQN vs targetDQN
img 2018-04-29 16-52-22.png

@
How to handle two networks in code
img 2018-04-29 16-54-35.png

@
Copying a network means copying the values of its weights
img 2018-04-29 16-56-59.png

@
Summary
1. Create two networks (main and target)
2. Make the target network identical to the main network:
   target = mainNet
3. Run the environment loop, storing transitions;
   when you build the training target y, use the target network,
   and update the main network using y
4. Periodically make the target network identical to the main network again,
   by assigning the main network's weights to the target network
@
Code related to replay train (targetDQN added)
img 2018-04-29 17-01-39.png

@
Code related to copying a network (its variables)
img 2018-04-29 17-02-23.png

Code related to bot play
img 2018-04-29 17-02-57.png

Code related to main()
img 2018-04-29 17-03-49.png
(minimal code sketches of these pieces appear at the end of these notes)

Exercise 1
Tune hyperparameters (learning rate, sample size, decay factor)
Network structure:
  add bias
  test tanh, sigmoid, relu, etc.
  improve the TF network to reduce the number of sess.run() calls
Reward redesign
img 2018-04-29 17-06-37.png

Exercise 2
Car racing with DQN 2015
img 2018-04-29 17-07-20.png

Exercise 3
DQN implementations
Other games
RMA approach
img 2018-04-29 17-08-00.png
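The code in the screenshots above is not reproduced in these notes, so the blocks below give a minimal sketch of the same pieces in TensorFlow 1.x. This is not the lecture's exact code: the class and function names (DQN, replay_train, get_copy_var_ops), the network shape, and the hyperparameters are illustrative assumptions. First, a simple Q-network wrapper with predict() and update() helpers; it is built under a named variable scope so that its weights can later be collected by scope name, which is what the copy operation relies on.

```python
import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x


class DQN:
    """Minimal Q-network: a state goes in, one Q-value per action comes out."""

    def __init__(self, session, input_size, output_size, name="main"):
        self.session = session
        self.input_size = input_size
        self.output_size = output_size
        self.net_name = name
        self._build_network()

    def _build_network(self, h_size=16, learning_rate=1e-3):
        # Build under a variable scope ("main" or "target") so the trainable
        # variables of each network can be fetched by scope name later.
        with tf.variable_scope(self.net_name):
            self._X = tf.placeholder(tf.float32, [None, self.input_size], name="input_x")
            W1 = tf.get_variable("W1", shape=[self.input_size, h_size],
                                 initializer=tf.contrib.layers.xavier_initializer())
            layer1 = tf.nn.relu(tf.matmul(self._X, W1))
            W2 = tf.get_variable("W2", shape=[h_size, self.output_size],
                                 initializer=tf.contrib.layers.xavier_initializer())
            self._Qpred = tf.matmul(layer1, W2)

        self._Y = tf.placeholder(tf.float32, [None, self.output_size])
        self._loss = tf.reduce_mean(tf.square(self._Y - self._Qpred))
        self._train = tf.train.AdamOptimizer(learning_rate).minimize(self._loss)

    def predict(self, state):
        # Q-values for a single state, shape (1, output_size).
        x = np.reshape(state, [1, self.input_size])
        return self.session.run(self._Qpred, feed_dict={self._X: x})

    def update(self, x_stack, y_stack):
        # One gradient step toward the targets y (used for the main network only).
        return self.session.run([self._loss, self._train],
                                feed_dict={self._X: x_stack, self._Y: y_stack})
```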
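Next, a sketch of the two routines the "replay train" and "copy network" screenshots cover, reusing the DQN class and imports above. get_copy_var_ops builds the assign ops that overwrite the target network's weights with the main network's weights (steps 2 and 4 of the summary); replay_train builds the target y with targetDQN and updates only mainDQN (step 3).

```python
def get_copy_var_ops(dest_scope_name="target", src_scope_name="main"):
    # Copying a network means copying the values of its weights:
    # for each trainable variable under "main", assign its value to the
    # matching variable under "target".
    src_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=src_scope_name)
    dest_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=dest_scope_name)
    return [dest_var.assign(src_var.value())
            for src_var, dest_var in zip(src_vars, dest_vars)]


def replay_train(mainDQN, targetDQN, train_batch, dis=0.99):
    # Build (state, y) pairs from a minibatch of stored transitions.
    x_stack = np.empty(0).reshape(0, mainDQN.input_size)
    y_stack = np.empty(0).reshape(0, mainDQN.output_size)
    for state, action, reward, next_state, done in train_batch:
        Q = mainDQN.predict(state)
        if done:
            Q[0, action] = reward
        else:
            # The target y is computed with the *target* network ...
            Q[0, action] = reward + dis * np.max(targetDQN.predict(next_state))
        x_stack = np.vstack([x_stack, state])
        y_stack = np.vstack([y_stack, Q[0]])
    # ... and only the *main* network is updated.
    return mainDQN.update(x_stack, y_stack)
```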
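Finally, a sketch of how main() might tie these pieces together on CartPole, following the four summary steps. The episode count, epsilon schedule, batch size, and sync interval are placeholder choices, and env.reset()/env.step() assume the classic (pre-0.26) gym API used at the time of the lecture.

```python
import random
from collections import deque

import gym


def main():
    env = gym.make("CartPole-v0")
    input_size = env.observation_space.shape[0]
    output_size = env.action_space.n
    replay_buffer = deque(maxlen=50000)
    batch_size = 64

    with tf.Session() as sess:
        # Step 1: create two networks, main and target.
        mainDQN = DQN(sess, input_size, output_size, name="main")
        targetDQN = DQN(sess, input_size, output_size, name="target")
        sess.run(tf.global_variables_initializer())

        # Step 2: make the target network identical to the main network.
        copy_ops = get_copy_var_ops(dest_scope_name="target", src_scope_name="main")
        sess.run(copy_ops)

        for episode in range(2000):
            e = 1.0 / ((episode / 10) + 1)  # decaying epsilon for exploration
            state = env.reset()
            done = False
            while not done:
                # Epsilon-greedy action from the main network.
                if np.random.rand() < e:
                    action = env.action_space.sample()
                else:
                    action = int(np.argmax(mainDQN.predict(state)))
                next_state, reward, done, _ = env.step(action)
                replay_buffer.append((state, action, reward, next_state, done))
                state = next_state

            # Step 3: every few episodes, train the main network on random
            # minibatches; y is built inside replay_train using the target network.
            if episode % 10 == 1 and len(replay_buffer) > batch_size:
                for _ in range(50):
                    minibatch = random.sample(replay_buffer, batch_size)
                    replay_train(mainDQN, targetDQN, minibatch)
                # Step 4: re-sync the target network with the main network.
                sess.run(copy_ops)


if __name__ == "__main__":
    main()
```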